ABSTRACT

The problem of determining relative weights for
quantity and quality in scoring foreign language speaking and writing
fluency tests is studied. French speaking and writing fluency tests
were administered to students of French in several schools in
England. Data from these tests were analyzed to support the suggestion
that scoring formulas should reflect two components of performance:
(1) quantity of correct response, and (2) relative quality of
response. Two quantity and five quality variables were identified and
correlated. Using a priori reasoning and the correlations, several
scoring formulas were tried. The study and the cross-validation study
indicate that nonlinear combinations of raw scores, probably ratios
and products, may be needed.






NOTE ON THE SCORING OF FOREIGN LANGUAGE SPEAKING 
AND WRITING FLUENCY TESTS 



John B. Carroll



This Bulletin is a draft for interoffice circulation.
Corrections and suggestions for revision are solicited.
The Bulletin should not be cited as a reference without
the specific permission of the author. It is automatically
superseded upon formal publication of the material.




Educational Testing Service 
Princeton, New Jersey
September 1970 



ED044441



Abstract

Data from French speaking and writing fluency tests are analyzed to support
the suggestion that scoring formulas should reflect two components of performance:
(1) quantity of correct response, and (2) relative quality of response. This
may require nonlinear combinations of raw scores, usually formulas involving
ratios or products.





In the scoring of foreign language speaking and writing fluency tests, a
perennial problem has been that of the relative weights to be given to quantity
and quality of response. If quantity of response is small, the scores for quality
tend to be unreliable; on the other hand, if quantity of response is large, the
scorer is likely to be either overimpressed with it or negatively influenced by
it, and scores based on the quantity of correct responses may consequently be
either inflated or unfairly decreased.

An opportunity to study this problem was presented in connection with the
author's work in developing a set of speaking and writing fluency tests in French
as a foreign language for the International Study of Educational Attainment,
familiarly known as I.E.A. (Husen, 196$).

The Tests 

The speaking fluency test consists of pictures of situations which the
respondent is asked to describe in French. There are two pictures in the test
designed for a population of 10-year-old learners (called Population I in the
I.E.A. study), and the child chooses one; in the test for older learners
(14-year-olds and pre-university populations, i.e., Populations II and IV,
respectively, as defined in the I.E.A. study) there are three pictures (not the
same as those for Population I) from which the pupil has to choose two to respond
to. This test is not timed; the child is simply told to describe the picture in
French, "to say anything he likes about the picture."

In a preliminary scoring of the speaking fluency test responses, scores were
assigned to each total response to each picture by a team of native French speakers,
as follows:

X_1 = number of "propositions" (French for clauses) in the response
X_2 = number of different grammatical structures represented in the response
X_3 = number of propositions with correct structures
X_4 = number of propositions with correct morphology
X_5 = number of propositions with correct vocabulary
X_6 = number of propositions with correct pronunciation
X_7 = number of propositions exhibiting one or more hesitations

In addition, a global rating on a 5-point scale from 0 to 4 (high), here
identified as Y, was assigned to the total response by this same team of scorers.
For the Population I cases, this was assigned on the basis of the response to
one picture; for the Population II and IV cases, it was based on the responses to
two pictures. It will be noted that X_1 is a measure of sheer quantity; X_2, X_3,
X_4, X_5, and X_6 are measures of both quantity and quality; X_7 is indicative of
quantity but, presumably, negative quality. The problem posed by these data was
to determine a suitable system for combining the X values into a single score
that would well predict the global rating Y, which was regarded as a criterion score.

The writing tests were slightly different for Populations II and IV; there
was no writing test for Population I. The Population II test directed the pupil
to write, within 10 minutes, a six-exchange dialogue between two persons (Louis
and Paul), including in the dialogue, in the order given, nine designated words
or phrases (with any appropriate grammatical changes necessary). Each exchange
was required to have at least three words, but could include more if necessary
"to tell the story clearly." The Population IV test directed the pupil to write,
within 10 minutes, a short "free" composition comparing the merits of living in
the country and in a big city. Certain "themes" were suggested, to be used in
the order given (e.g., advantages of country life: peace and quiet, scenery, good
food, health). For both Populations II and IV, the compositions were scored by
native French speakers with respect to three 5-point scales (0 to 4): fluency
or completeness (amount written), grammatical accuracy, and style. Again, the
problem was how to combine these scores into a single index. For the compositions,
however, there was no direct criterion.

Subjects 

The tests were given to pupils in several schools in England where they were
being taught French. For the speaking tests, there were 17 pupils in Population
I, 13 in Population II, and 33 in Population IV. For the composition tests, data
were available for 28 pupils in Population II and 180 pupils in Population IV.

Analysis of Results 

Speaking test scores. The first step was to compute and examine the Pearsonian
correlations among the raw, untransformed, uncombined scores for all three
populations pooled (N = 63) for the responses to the first picture chosen. (In the
case of Population I, these were the only data available.) The plan was to develop
a scoring procedure for the first response and cross-validate it on the second
response (available only for Populations II and IV pupils). The correlations thus
obtained are shown in Table 1. Several initial conclusions were drawn from this
table:

Insert Table 1 about here

(1) Sheer quantity of response (X_1) had little correlation with the global rating,
yet it would be a mistake to omit it from the scoring scheme since it showed
appreciable correlations with other variables; X_1 could be a suppressor variable.
(2) The highest correlations with the global rating were yielded by variables X_4,
X_3, X_5, and X_6, in that order; all these variables appeared to form a rather
tight cluster. Variable X_2 also showed an appreciable correlation with the
criterion but smaller correlations with variables X_3 through X_6. (3) Variable
X_7, the number of clauses with hesitations, showed a negligible correlation with
the criterion; its high correlation with the number of clauses indicated that it
was primarily another measure of quantity.

At this point the standard method of procedure would dictate computing a
regression equation for the predictor variables. Before such a procedure was
followed, however, it was decided to investigate methods of transforming or
nonlinearly combining the measures of quantity and quality. Variables X_4 and X_1
were selected for special study in view of the former's high correlation with the
criterion. By making various three-dimensional scatterplots for transformations
or combinations of these variables (the criterion variable being entered as
numbers to represent the third dimension) it appeared that the best procedure for
combining the variables would be to establish a new variable, X_4/X_1,
and then to compute the optimal weights for X_4 and X_4/X_1 for predicting the
criterion. The resulting multiple correlation was .8088, with beta-weights of
.5271 for X_4 and .3260 for X_4/X_1. This multiple correlation was in fact slightly
superior to that for the linear combination of X_1 and X_4, namely .8035, with
beta-weights of -.2148 for X_1 and .8745 for X_4. It was decided, also, that this
way of combining variables made psychological sense, in that it represented a
postulated process whereby the scorer takes into account not the sheer quantity of
response but, rather, two perceptible aspects of the response: (1) the quantity
of correct response, and (2) the proportion of the total response that is correct.
Such a judgmental process seems intuitively more reasonable than one whereby the
scorer takes into account the quantity of correct response and then "subtracts"
points for the quantity of total response. If, for example, a respondent produced
a large quantity of response that was all correct, there would be no reason (and
it would be unfair) to penalize him for producing a lengthy response, a procedure
that would be implied by the straightforward linear combination of the raw scores,
with its negative beta-weight for X_1.
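As an illustrative check that is not part of the original analysis, the beta-weights and multiple correlation for the linear combination of X_1 and X_4 can be recovered from the two-decimal correlations in Table 1 using the standard two-predictor formulas. The function name and layout are assumptions made for this sketch, and because the inputs are rounded the agreement with the reported figures is only approximate.

```python
import math


def two_predictor_regression(r1y, r2y, r12):
    """Standardized weights and multiple R for two predictors.

    r1y, r2y: correlation of each predictor with the criterion.
    r12: correlation between the two predictors.
    """
    denom = 1.0 - r12 ** 2
    beta1 = (r1y - r2y * r12) / denom
    beta2 = (r2y - r1y * r12) / denom
    # With standardized variables, R^2 is the sum of beta * criterion correlation.
    multiple_r = math.sqrt(beta1 * r1y + beta2 * r2y)
    return beta1, beta2, multiple_r


# Table 1 values: r(X_1, Y) = .17, r(X_4, Y) = .78, r(X_1, X_4) = .44
beta_x1, beta_x4, r_linear = two_predictor_regression(0.17, 0.78, 0.44)
print(round(beta_x1, 4), round(beta_x4, 4), round(r_linear, 4))
```

The negative weight on X_1 arises although X_1 itself correlates positively with the rating, which is the suppressor pattern the text describes.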

This matter was later checked by comparing the multiple correlations and
beta-weights for the two procedures as applied to all variables X_2 through X_6.
The results are shown in Table 2.

Insert Table 2 about here

It will be there observed that, actually, the nonlinear combination procedure
produces higher multiple correlations for only two of the variables (X_2 and X_4).
Nevertheless, the a priori line of reasoning developed above
suggests that the nonlinear combination procedure makes for more sensible and
fairer results. It was concluded that the final scoring formula should be based
on the nonlinear combination procedure.

It was desired that the final scoring formula be as simple as possible to
apply. It was decided, therefore, to determine optimal weights for two summational
variables:

X_8 = X_2 + X_3 + X_4 + X_5 + X_6
X_9 = (X_2 + X_3 + X_4 + X_5 + X_6) / X_1 .

The results are shown in Table 3.

Insert Table 3 about here

Of interest is the fact that the correlation between X_8 and X_9 is far from
unity, and also the fact that the beta-weights for
the two variables are approximately equal, indicating that they make approximately
equal independent contributions to the prediction. The multiple correlation with
the criterion is very appreciably higher than any of the zero-order criterion
correlations in Table 1, and also higher than any of the multiple correlations
shown in Table 2. In order to simplify the scoring formula still further, it
was noted that the ratio of the b-weight for X_9 to that for X_8 was approximately
10. Therefore, the final scoring formula was defined as follows:

X_10 = X_8 + 10X_9 .

The correlation of X_10 with Y is very nearly .8537.
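A minimal sketch of how the speaking score might be computed for one picture response follows. It assumes that X_8 is the sum of X_2 through X_6 and that X_9 is that sum divided by X_1, a reading that agrees with the means reported in Table 3 (2.57 + 4.48 + 3.49 + 4.22 + 2.76 = 17.52). The function name, the handling of an empty response, and the two example pupils are hypothetical.

```python
def speaking_score(x1, x2, x3, x4, x5, x6):
    """Return (X_8, X_9, X_10) for one picture response."""
    x8 = x2 + x3 + x4 + x5 + x6        # quantity of correct response
    # Assumption for this sketch: a response with no clauses gets a ratio of 0.
    x9 = x8 / x1 if x1 > 0 else 0.0    # correct response relative to total clauses
    x10 = x8 + 10 * x9                 # final score
    return x8, x9, x10


# Hypothetical pupils (invented counts, not data from the study).
short_but_clean = speaking_score(x1=4, x2=3, x3=4, x4=4, x5=4, x6=3)
long_but_faulty = speaking_score(x1=12, x2=3, x3=5, x4=4, x5=5, x6=1)
print(short_but_clean)
print(long_but_faulty)
```

Both invented pupils have the same quantity of correct response, so the difference in their final scores comes entirely from the ratio term.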

The scoring formula represented by X_10, developed on the basis of the
data available for the first speaking test, was "cross-validated" by applying
it to the data for the second speaking test. It will be recalled that data were
available for the second speaking test only for the 46 cases in Populations II
and IV, a subset of the cases used in developing the scoring formula. Strictly
speaking, this was not cross-validation in the usual sense of applying a formula
to a completely different set of cases. The "cross-validation" was in truth a
matter of applying a scoring formula to a different set of data (an "alternate
form" of the test, so to speak) from the same set of cases, or actually a subset.
For the 46 cases in Populations II and IV, correlations were obtained among
variables X_8, X_9, and X_10 for both the first and second speaking tests, as
well as the correlations of these variables with the global rating. The results
are shown in Table 4.

Insert Table 4 about here

The scoring formula produced a validity coefficient of
.80 in the case of the first speaking test (a figure analogous to the value of
.85 yielded for the complete set of 63 cases), but the validity shrank to .67 when
the scoring formula was applied to the second test. In this same sample, the
correlation between the final scores of the first and second speaking tests was
.73, a value that indicates the reliability of the scoring formula. (By the
Spearman-Brown formula, the reliability of scores combined from both tests for
Populations II and IV would be estimated as .84.)
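The Spearman-Brown step can be sketched as follows: starting from the correlation of .73 between the two tests, the formula estimates the reliability of a score based on both, which comes to about .84. The function name and the default of doubling the test length are choices made for this illustration.

```python
def spearman_brown(r, k=2):
    """Estimated reliability of a test lengthened k times, given part reliability r."""
    return k * r / (1 + (k - 1) * r)


combined = spearman_brown(0.73)
print(round(combined, 2))
```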

It was of interest to investigate the reasons for the shrinkage in the validity
of the scoring formula. Correlations were computed among the raw variables of
the second test for the restricted sample, as well as with the global ratings,
with results shown in Table 5.

Insert Table 5 about here

It is evident from this table that the structure
of the variables in the restricted sample is somewhat different from that observed
in Table 1. Quantity of response (X_1) is much more highly correlated with the
remainder of the predictor variables, as well as with the criterion variable.
Even the presumably negatively oriented variable X_7 (number of clauses with
hesitations) has an appreciable positive correlation with the criterion. If we had
begun our investigation with the data of Table 5, it is possible that we would not
have come up with the conclusion that we arrived at from the data of the first
test. On the other hand, the second speaking test did not yield the high
correlations of variables X_3 and X_4 with the criterion that were observed with
the first speaking test.
In view of the larger and more varied sample that was available for arriving at
the scoring formula, as well as the intuitively persuasive rationale for this
formula, it was decided to accept it despite the appreciable shrinkage that
occurred for the data of the second speaking test.

Another feature of the data that makes the interpretation of the
"cross-validation" difficult is the fact that if we compare the means and standard
deviations shown in Tables 1 and 5 for the seven raw scores on the first and second
speaking tests, the means for the second test (N = 46) are not in every case
higher than the means on the first test for the complete sample (N = 63), as
we might expect them to be in view of the fact that Populations II and IV are
more advanced than Population I cases. Furthermore, the standard deviations
for the second-test scores are in most cases larger than those for the
first-test scores. These features may be artifacts of the data, due partly to the
fact that the stimuli for the speaking tests were different between populations,
or to possible practice effects occurring from the first to the second
test.

It is interesting to notice in Table 4, for the cross-validation data,
that for both the first and the second speaking tests the use of the ratio
variable X_9 produces an increment in the validity of the final scoring formula,
X_10, over the "number right" variable X_8.

Writing test scores. As noted previously, there was no appropriate criterion
for evaluating the writing test scores. On the assumption that the Population IV
responses should be on the average better than the Population II responses, a
nominal criterion, here called Y, was assigned such that the Population II
cases had Y = 2 and Population IV had Y = 4. It was recognized that the
tests for the two populations differed in important respects, and any differences
between the populations revealed by the tests would be attenuated by the fact
that each test had been geared to a specific range of competence. Also, we must
recognize that the number of cases in Population II was only 28 as compared to
the 180 cases in Population IV. Nevertheless, in the absence of any better
criterion, it was felt that statistical operations based on optimal weightings
of scores to differentiate the samples from the two populations would suggest a
scoring formula that would have some likelihood of holding up against a superior
criterion. (It is contemplated that better and more complete data will become
available at a later time.)
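Because the nominal criterion takes only two values, its mean and spread follow directly from the group sizes. The short sketch below rebuilds Y from the stated counts (28 pupils with Y = 2 and 180 with Y = 4) and computes both; assuming a divide-by-N standard deviation, the results agree with the mean and S.D. given for Y in Table 6. The variable names are illustrative.

```python
import statistics

# Nominal criterion: Y = 2 for the 28 Population II pupils,
# Y = 4 for the 180 Population IV pupils.
y = [2] * 28 + [4] * 180

mean_y = statistics.fmean(y)
sd_y = statistics.pstdev(y)  # population (divide-by-N) standard deviation
print(round(mean_y, 4), round(sd_y, 4))
```

With so unequal a split, the criterion has little variance, which is one reason the correlations with it can be expected to be small.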

It was decided to explore the possible generality of the rationale developed
for the speaking test formula. Recall that there were three scores assigned for
the writing test, all on a scale from 0 to 4: X_1, a measure of the "length" or
completeness of the response; X_2, an assessment of the relative grammatical
accuracy of the response; and X_3, an assessment of the quality of "style" of
the response. The rationale developed for the speaking test scoring suggested
that quantity of correct response and relative correctness of the total response
should be the two factors considered in a scoring formula. Applying this rationale
to the writing test scores, we would conclude that X_2 and possibly also X_3
are measures of relative correctness as they stand. To obtain measures of the
quantity of correct response, however, we should use some function of the product
of X_1 times X_2 and/or X_3. To gain insight into the relationship among the
raw scores and such functions, a matrix of correlations was computed among the
raw variables, several functions of them, and the nominal criterion Y. The
functions of the raw scores investigated were: X_1X_2, X_1X_3, X_2 + X_3, and
X_1(X_2 + X_3). The correlation matrix is shown in Table 6. Also, multiple
regression systems were computed for several combinations of the variables, as
shown in Table 7.

Insert Table 6 about here

Insert Table 7 about here

From the results in Table 6, it will be immediately noticed
that none of the variables correlates highly with the nominal criterion; only
correlations equal to or greater than .0895 are significantly positive at the 5%
level, or .1434 at the 1% level (by a one-tailed test, considered legitimate here
because we should expect the correlations to be positive). We have already
mentioned the limitations of the data that are likely to have resulted in such low
correlations. Nevertheless, the correlations of X_2 and X_1X_2 with the criterion
are significant at the 1% level.

In Table 7, the computations for Combinations 1, 2, 3, and 4 permit us to
examine whether the nonlinear combinations are superior to the linear combinations.
In the case of X_1 and X_2, the nonlinear combination is slightly superior,
supporting the rationale for such a combination. This is not the case for
variables X_1 and X_3, however; in fact, variable X_3 receives a negative weight in
the nonlinear combination. We take as a principle the proposition that a score
should not receive a negative weight in a scoring formula. It is noteworthy,
however, that X_1X_3 receives a positive weight, and in fact it has a higher
zero-order correlation with the criterion than does X_3 in its original
form. This suggests that the assessment of style should enter the scoring formula
in the form X_1X_3. Adding this fact to the fact that the multiple regression
for X_2 and X_1X_2 yields approximately equal beta-weights for these variables,
we conclude that the final scoring formula should possibly be a linear function
of X_2, X_1X_2, and X_1X_3.

First, however, let us examine the multiple regressions for Combinations 5
and 6; these are, respectively, for the linear combination of X_1, X_2, and X_3,
and for the variables X_2, X_3, X_1X_2, and X_1X_3. The nonlinear combination of
variables yields a slightly higher multiple R than does the linear combination.
However, because of the negative weights for variables X_3 and X_1X_2 in
Combination 6 it is not reasonable to use it as a basis for a scoring formula.

Combination 7 shows the multiple regression system for variables X_2, X_1X_2,
and X_1X_3. Unfortunately, variable X_1X_3 again receives a negative weight of
appreciable size. Although the multiple correlation is still nearly as high as
would be obtained from Combinations 5 or 6, we must reject this multiple regression
system as a basis for a scoring formula.

At this point we might decide to eliminate variable X_3 completely from the
scoring formula, but this seems a possibly unfortunate thing to do because it
loses information. It is possible that the negative weight of X_1X_3 arises from
some sort of sampling error. Under the circumstances, it seems advisable to
include X_3 in some fashion. Considering that variable X_1X_3 has been shown to
have a reasonably "high" correlation with the criterion, we decide to combine it
with X_1X_2 and make the scoring formula a linear combination of X_2 and
X_1(X_2 + X_3). The multiple regression for such a combination is shown in Table 7
under the heading Combination 8. Now the variable X_1(X_2 + X_3) receives a
relatively small weight, but at least it remains positive. The ratio of the
b-weight for the first of these variables to the second is about 36. As a quite
arbitrary matter, let us prescribe the final scoring formula as, for the sake of
simplicity,

Score = 10X_2 + X_1(X_2 + X_3) .

The coefficient of 10 is used for X_2 rather than 36 in order to give relatively
more weight to the second term in the formula than would be assigned by the
multiple regression weights. Whereas the multiple correlation of the variables
with the criterion is .1517, the scoring formula with the coefficient of 10 yields
a correlation nearly as high, namely .1492. The standard deviations of the two
terms in the formula are 8.0250 and 5.5352, respectively. Figure 1 depicts the
final scores that will be obtained for various combinations of scores on X_1, X_2,
and X_3.

Insert Figure 1 about here

It can be seen that the score on X_2 is the principal determiner of
the score, but the pupil gets points for quantity of correct response
in terms of grammar and style; the increment of score he gets depends upon
his rating for grammatical accuracy. Obviously, he cannot get a score of other
than X_2 = 0 if he does not produce any response; for this reason, the scores
shown for X_1 = 0 on the chart are spurious. Also, it happens that because of
the correlations among the scores, a number of score combinations are extremely
unlikely to occur. The figures shown on the chart are the actual frequencies
of the score combinations in the data employed for this analysis, and the
distributions of scores in the two populations are shown at the right of the chart.
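The writing score can be sketched in a few lines, reading the formula as Score = 10X_2 + X_1(X_2 + X_3), as in the figure caption, with each rating on the 0-to-4 scale. The function name, the range check, and the example ratings are assumptions made for illustration.

```python
def writing_score(x1, x2, x3):
    """Score = 10*X_2 + X_1*(X_2 + X_3), each rating on the 0-to-4 scale.

    x1: fluency or completeness, x2: grammatical accuracy, x3: style.
    """
    for name, value in (("x1", x1), ("x2", x2), ("x3", x3)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must lie between 0 and 4, got {value}")
    return 10 * x2 + x1 * (x2 + x3)


# Hypothetical ratings (fluency, grammatical accuracy, style).
print(writing_score(4, 2, 1))  # long but only moderately accurate
print(writing_score(2, 3, 2))  # shorter but more accurate
print(writing_score(4, 4, 4))  # maximum on all three scales
```

In these invented cases the shorter, more accurate composition outscores the longer one, which reflects the dominant role the formula gives to X_2.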
Table 1

Pearsonian Correlations, Raw Scores and Global Rating,
First Speaking Test, All Pupils in Populations
I, II, and IV (N = 63)

Variable                                    1     2     3     4     5     6     7     Y

No. clauses                           1  1.00   .10   .48   .44   .64   .62   .91   .17
No. different structures              2   .10  1.00   .55   .59   .45   .38  -.03   .61
No. clauses w/correct structure       3   .48   .55  1.00   .85   .86   .71   .25   .74
No. clauses w/correct morphology      4   .44   .59   .85  1.00   .82   .72   .24   .78
No. clauses w/correct vocabulary      5   .64   .45   .86   .82  1.00   .77   .42   .66
No. clauses w/correct pronunciation   6   .62   .38   .71   .72   .77  1.00   .44   .63
No. clauses w/hesitations             7   .91  -.03   .25   .24   .42   .44  1.00  -.02
Global rating                         Y   .17   .61   .74   .78   .66   .63  -.02  1.00

Mean                                     7.38  2.57  4.48  3.49  4.22  2.76  5.27  1.29
S.D.                                     4.26  1.46  2.84  2.62  2.86  2.89  3.80  1.09
Table 2

Comparison of Linear and Nonlinear Scoring
Procedures, First Speaking Test (N = 63)

                                                 Linear Combination      Nonlinear Combination
Variable X_i                                    β(X_1)  β(X_i)     R     β(X_i)  β(X_i/X_1)    R

X_2 = No. different structures                   .1101   .5990   .6197    .7482    -.2226    .6315
X_3 = No. clauses w/correct structure           -.2406   .8550   .7693    .5629     .2473    .7567
X_4 = No. clauses w/correct morphology          -.2148   .8745   .8035    .5271     .3260    .8088
X_5 = No. clauses w/correct vocabulary          -.4275   .9336   .7372    .4148     .3613    .7067
X_6 = No. clauses w/correct pronunciation       -.3583   .8522   .6899    .1931     .5035    .6766
Table 3

Correlations and Regression Analysis for Components of
Final Scoring Formula, First Speaking Test (N = 63)

            Correlations
         X_8     X_9      Y       Mean     S.D.       β        b

X_8     1.00     .69     .78     17.52    11.14     .4561    .0446
X_9      .69    1.00     .79      2.42     1.11     .4521    .4644
Y        .78     .79    1.00      1.29     1.09

                                                    R = .8537
Table 4

Correlations Across First and Second Speaking Tests,
Pupils in Population II and IV (N = 46)

                          1st Test                   2nd Test
                     X_8    10X_9    X_10       X_8    10X_9    X_10        Y

1st Test   X_8      1.00     .67     .95        .70     .50     .69       .76
           10X_9     .67    1.00     .87        .62     .48     .63       .70
           X_10      .95     .87    1.00        .73     .54     .73       .80

2nd Test   X_8       .70     .62     .73       1.00     .60     .95       .65
           10X_9     .50     .48     .54        .60    1.00     .81       .52
           X_10      .69     .63     .73        .95     .81    1.00       .67

Y                    .76     .70     .80        .65     .52     .67      1.00

Mean               20.46   29.75   50.21      22.20   30.12   52.32      1.67
S.D.               10.52    6.85   15.95      15.71    8.20   21.66      1.00
Table 5

Correlations Among Original Variables, Second Speaking Test,
Pupils in Population II and IV (N = 46)

          X_1     X_2     X_3     X_4     X_5     X_6     X_7      Y

X_1      1.00     .86     .94     .89     .92     .83     .89     .53
X_2       .86    1.00     .79     .73     .83     .74     .74     .50
X_3       .94     .79    1.00     .92     .92     .85     .84     .57
X_4       .89     .73     .92    1.00     .91     .85     .77     .63
X_5       .92     .83     .92     .91    1.00     .87     .79     .68
X_6       .83     .74     .85     .85     .87    1.00     .70     .63
X_7       .89     .74     .84     .77     .79     .70    1.00     .48
Y         .53     .50     .57     .63     .68     .63     .48    1.00

Mean     6.91    3.30    5.61    4.46    5.04    3.78    4.72    1.67
S.D.     4.03    2.16    3.85    3.31    3.46    3.93    2.58    1.00
Table 6

Intercorrelations of Selected Functions of Writing Test
Scores and the Nominal Criterion (Y)
(N = 208*)

                           1        2        3        4        5        6        7        8

X_1               1     1.0000    .4409    .3980    .7614    .7395    .4689    .7974    .0921
X_2               2      .4409   1.0000    .6038    .8260    .5495    .9006    .7337    .1509
X_3               3      .3980    .6038   1.0000    .5307    .8152    .8903    .7117    .0424
X_1X_2            4      .7614    .8260    .5307   1.0000    .7722    .7613    .9438    .1474
X_1X_3            5      .7395    .5495    .8152    .7722   1.0000    .7585    .9388    .0816
X_2 + X_3         6      .4689    .9006    .8903    .7613    .7585   1.0000    .8073    .1093
X_1(X_2 + X_3)    7      .7974    .7337    .7117    .9438    .9388    .8073   1.0000    .1223
Y                 8      .0921    .1509    .0424    .1474    .0816    .1093    .1223   1.0000

Mean                    2.2644   1.4856   1.4038   3.7644   3.5240   2.8894   7.2885   3.7308
S.D.                    1.1318    .8025    .7661   3.0012   2.8789   1.4048   5.5352    .6826

* r = .0895 at p = .05; r = .1434 at p = .01 (one-tailed test).
Table 7

Multiple Regression Systems for Several Combinations
of Scores on French Writing Test

Combination               1                  2                  3                  4
                      β        b         β        b         β        b         β        b

X_1                 .0318    .0191                        .0894    .0539
X_2                 .1369    .1165     .0916    .0779
X_3                                                       .0068    .0061    -.0719   -.0641
X_1X_2                                 .0718    .0163
X_1X_3                                                                       .1401    .0332
R                   .1536              .1526              .0855              .0757

Combination               5                  6                  7                  8
                      β        b         β        b         β        b         β        b

X_1                 .0459    .0277
X_2                 .1826    .1554     .2735    .2327     .0740    .0629     .1323    .1126
X_3                -.0861   -.0767    -.2343   -.2088
X_1X_2                                -.1204   -.0274     .1353    .0308
X_1X_3                                 .2153    .0510    -.0635   -.0150
X_1(X_2 + X_3)                                                               .0251    .0031
R                   .1677              .1765              .1610              .1517
Figure Caption

Figure 1. Nomograph for final scores, Score = 10X_2 + X_1(X_2 + X_3),
on French writing test, where X_1 = fluency or completeness, X_2 = grammatical
accuracy, and X_3 = style. Each line is labeled with an ordered pair of
scores on (X_2, X_3). Numbers in small circles are frequencies of scores
at the given points. At the right are found the frequency distributions
of final scores for cases in Population II, Population IV, and the total.